monte carlo search
Learning a Prior for Monte Carlo Search by Replaying Solutions to Combinatorial Problems
Monte Carlo Search gives excellent results in multiple difficult combinatorial problems. Using a prior to perform non uniform playouts during the search improves a lot the results compared to uniform playouts. Handmade heuristics tailored to the combinatorial problem are often used as priors. We propose a method to automatically compute a prior. It uses statistics on solved problems. It is a simple and general method that incurs no computational cost at playout time and that brings large performance gains. The method is applied to three difficult combinatorial problems: Latin Square Completion, Kakuro, and Inverse RNA Folding.
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Leisure & Entertainment > Games (0.94)
Generalized Nested Rollout Policy Adaptation with Limited Repetitions
Generalized Nested Rollout Policy Adaptation (GNRPA) is a Monte Carlo search algorithm for optimizing a sequence of choices. We propose to improve on GNRPA by avoiding too deterministic policies that find again and again the same sequence of choices. We do so by limiting the number of repetitions of the best sequence found at a given level. Experiments show that it improves the algorithm for three different combinatorial problems: Inverse RNA Folding, the Traveling Salesman Problem with Time Windows and the Weak Schur problem.
- Leisure & Entertainment > Games (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.69)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.52)
- Information Technology > Artificial Intelligence > Games > Go (0.47)
Adaptive Monte Carlo Search for Conjecture Refutation in Graph Theory
Vito, Valentino, Stefanus, Lim Yohanes
Graph theory is an interdisciplinary field of study that has various applications in mathematical modeling and computer science. Research in graph theory depends on the creation of not only theorems but also conjectures. Conjecture-refuting algorithms attempt to refute conjectures by searching for counterexamples to those conjectures, often by maximizing certain score functions on graphs. This study proposes a novel conjecture-refuting algorithm, referred to as the adaptive Monte Carlo search (AMCS) algorithm, obtained by modifying the Monte Carlo tree search algorithm. Evaluated based on its success in finding counterexamples to several graph theory conjectures, AMCS outperforms existing conjecture-refuting algorithms. The algorithm is further utilized to refute six open conjectures, two of which were chemical graph theory conjectures formulated by Liu et al. in 2021 and four of which were formulated by the AutoGraphiX computer system in 2006. Finally, four of the open conjectures are strongly refuted by generalizing the counterexamples obtained by AMCS to produce a family of counterexamples. It is expected that the algorithm can help researchers test graph-theoretic conjectures more effectively.
- Europe > Croatia > Zagreb County > Zagreb (0.05)
- Asia > Indonesia (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.92)
Improving few-shot learning-based protein engineering with evolutionary sampling
Jawaid, M. Zaki, Yeo, Robin W., Gautam, Aayushma, Gainous, T. Blair, Hart, Daniel O., Daley, Timothy P.
Designing novel functional proteins remains a slow and expensive process due to a variety of protein engineering challenges; in particular, the number of protein variants that can be experimentally tested in a given assay pales in comparison to the vastness of the overall sequence space, resulting in low hit rates and expensive wet lab testing cycles. In this paper, we propose a few-shot learning approach to novel protein design that aims to accelerate the expensive wet lab testing cycle and is capable of leveraging a training dataset that is both small and skewed ($\approx 10^5$ datapoints, $< 1\%$ positive hits). Our approach is composed of two parts: a semi-supervised transfer learning approach to generate a discrete fitness landscape for a desired protein function and a novel evolutionary Monte Carlo Markov Chain sampling algorithm to more efficiently explore the fitness landscape. We demonstrate the performance of our approach by experimentally screening predicted high fitness gene activators, resulting in a dramatically improved hit rate compared to existing methods. Our method can be easily adapted to other protein engineering and design problems, particularly where the cost associated with obtaining labeled data is significantly high. We have provided open source code for our method at https:// github.com/SuperSecretBioTech/evolutionary_monte_carlo_search.
- North America > United States > New York > New York County > New York City (0.04)
- Europe > France (0.04)
Solving the HP model with Nested Monte Carlo Search
Roucairol, Milo, Cazenave, Tristan
In this paper we present a new Monte Carlo Search (MCS) algorithm for finding the ground state energy of proteins in the HP-model. We also compare it briefly to other MCS algorithms not usually used on the HP-model and provide an overview of the algorithms used on HP-model. The algorithm presented in this paper does not beat state of the art algorithms, see PERM (Hsu and Grassberger 2011), REMC (Thachuk, Shmygelska, and Hoos 2007) or WLRE (W\"ust and Landau 2012) for better results. Hsu, H.-P.; and Grassberger, P. 2011. A review of Monte Carlo simulations of polymers with PERM. Journal of Statistical Physics, 144 (3): 597 to 637. Thachuk, C.; Shmygelska, A.; and Hoos, H. H. 2007. A replica exchange Monte Carlo algorithm for protein folding in the HP model. BMC Bioinformatics, 8(1): 342. W\"ust, T.; and Landau, D. P. 2012. Optimized Wang-Landau sampling of lattice polymers: Ground state search and folding thermodynamics of HP model proteins. The Journal of Chemical Physics, 137(6): 064903.
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.96)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.74)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.71)
Measurable Monte Carlo Search Error Bounds
Mern, John, Kochenderfer, Mykel J.
Monte Carlo planners can often return sub-optimal actions, even if they are guaranteed to converge in the limit of infinite samples. Known asymptotic regret bounds do not provide any way to measure confidence of a recommended action at the conclusion of search. In this work, we prove bounds on the sub-optimality of Monte Carlo estimates for non-stationary bandits and Markov decision processes. These bounds can be directly computed at the conclusion of the search and do not require knowledge of the true action-value. The presented bound holds for general Monte Carlo solvers meeting mild convergence conditions. We empirically test the tightness of the bounds through experiments on a multi-armed bandit and a discrete Markov decision process for both a simple solver and Monte Carlo tree search.
- North America > United States > California > Santa Clara County > Stanford (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
Downhole Track Detection via Multiscale Conditional Generative Adversarial Nets
Li, Jia, Wei, Xing, Yang, Guoqiang, Sun, Xiao, Li, Changliang
Frequent mine disasters cause a large number of casualties and property losses. Autonomous driving is a fundamental measure for solving this problem, and track detection is one of the key technologies for computer vision to achieve downhole automatic driving. The track detection result based on the traditional convolutional neural network (CNN) algorithm lacks the detailed and unique description of the object and relies too much on visual postprocessing technology. Therefore, this paper proposes a track detection algorithm based on a multiscale conditional generative adversarial network (CGAN). The generator is decomposed into global and local parts using a multigranularity structure in the generator network. A multiscale shared convolution structure is adopted in the discriminator network to further supervise training the generator. Finally, the Monte Carlo search technique is introduced to search the intermediate state of the generator, and the result is sent to the discriminator for comparison. Compared with the existing work, our model achieved 82.43\% pixel accuracy and an average intersection-over-union (IOU) of 0.6218, and the detection of the track reached 95.01\% accuracy in the downhole roadway scene test set.
- Asia > China > Anhui Province > Hefei (0.05)
- Asia > Japan > Shikoku > Tokushima Prefecture > Tokushima (0.04)
- Asia > China > Liaoning Province > Dalian (0.04)
- Transportation > Ground > Rail (0.46)
- Transportation > Ground > Road (0.34)
How AlphaZero Works
Recently I posted about the phenomenal performance of the AlphaZero algorithm in computer chess. For the first time in history, an algorithm displayed human-like understanding of chess. AlphaZero seemed to understand what moves were best and spent its time focusing only on them. It didn't mechanically crunch through millions of possible positions, run out of time, and then select the best move. The best moves emerged from its computer neural network, like a human grandmaster. It was given just the rules of chess and nine hours to play itself 44 million games, and then it learned something so deep about chess that it crushed the world champion computer chess program, Stockfish, 155 games to 6. (They played 1,000 total games; at this level most games are draws.)
SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient
Yu, Lantao (Shanghai Jiao Tong University) | Zhang, Weinan (Shanghai Jiao Tong University) | Wang, Jun (University College London) | Yu, Yong (Shanghai Jiao Tong University)
As a new way of training generative models, Generative Adversarial Net (GAN) that uses a discriminative model to guide the training of the generative model has enjoyed considerable success in generating real-valued data. However, it has limitations when the goal is for generating sequences of discrete tokens. A major reason lies in that the discrete outputs from the generative model make it difficult to pass the gradient update from the discriminative model to the generative model. Also, the discriminative model can only assess a complete sequence, while for a partially generated sequence, it is non-trivial to balance its current score and the future one once the entire sequence has been generated. In this paper, we propose a sequence generation framework, called SeqGAN, to solve the problems. Modeling the data generator as a stochastic policy in reinforcement learning (RL), SeqGAN bypasses the generator differentiation problem by directly performing gradient policy update. The RL reward signal comes from the GAN discriminator judged on a complete sequence, and is passed back to the intermediate state-action steps using Monte Carlo search. Extensive experiments on synthetic data and real-world tasks demonstrate significant improvements over strong baselines.
- Leisure & Entertainment (1.00)
- Media > Music (0.47)